An Efficient Metric of Automatic Weight Generation for Properties in Instance Matching Technique

نویسندگان

  • Md. Seddiqui Hanif
  • Rudra Pratap Deb Nath
  • Masaki Aono
چکیده

The proliferation of heterogeneous data sources of semantic knowledge base intensifies the need of an automatic instance matching technique. However, the efficiency of instance matching is often influenced by the weight of a property associated to instances. Automatic weight generation is a non-trivial, however an important task in instance matching technique. Therefore, identifying an appropriate metric for generating weight for a property automatically is nevertheless a formidable task. In this paper, we investigate an approach of generating weights automatically by considering hypotheses: (1) the weight of a property is directly proportional to the ratio of the number of its distinct values to the number of instances contain the property, and (2) the weight is also proportional to the ratio of the number of distinct values of a property to the number of instances in a training dataset. The basic intuition behind the use of our approach is the classical theory of information content that infrequent words are more informative than frequent ones. Our mathematical model derives a metric for generating property weights automatically, which is applied in instance matching system to produce re-conciliated instances efficiently. Our experiments and evaluations show the effectiveness of our proposed metric of automatic weight generation for properties in an instance matching technique.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Vehicle Logo Recognition Using Image Matching and Textural Features

In recent years, automatic recognition of vehicle logos has become one of the important issues in modern cities. This is due to the unlimited increase of cars and transportation systems that make it impossible to be fully managed and monitored by human. In this research, an automatic real-time logo recognition system for moving cars is introduced based on histogram manipulation. In the proposed...

متن کامل

An Approach for Automatic Matching of Descriptive Addresses

Address matching (also called geocoding) is an applied spatial analysis which is frequently used in everyday life. Almost all desktop and web-based GIS environments are equipped with a module to match the addresses expressed in pre-defined standard formats on the map. It is an essential prerequisite for many of the functionalities provided by location-based services (e.g. car navigation). Sever...

متن کامل

A Hybrid Meta-heuristic Approach to Cope with State Space Explosion in Model Checking Technique for Deadlock Freeness

Model checking is an automatic technique for software verification through which all reachable states are generated from an initial state to finding errors and desirable patterns. In the model checking approach, the behavior and structure of system should be modeled. Graph transformation system is a graphical formal modeling language to specify and model the system. However, modeling of large s...

متن کامل

An Efficient Adaptive Boundary Matching Algorithm for Video Error Concealment

Sending compressed video data in error-prone environments (like the Internet and wireless networks) might cause data degradation. Error concealment techniques try to conceal the received data in the decoder side. In this paper, an adaptive boundary matching algorithm is presented for recovering the damaged motion vectors (MVs). This algorithm uses an outer boundary matching or directional tempo...

متن کامل

Fractured Reservoirs History Matching based on Proxy Model and Intelligent Optimization Algorithms

   In this paper, a new robust approach based on Least Square Support Vector Machine (LSSVM) as a proxy model is used for an automatic fractured reservoir history matching. The proxy model is made to model the history match objective function (mismatch values) based on the history data of the field. This model is then used to minimize the objective function through Particle Swarm Optimization (...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1502.03556  شماره 

صفحات  -

تاریخ انتشار 2015